[pull] master from git:master#81
Merged
pull[bot] merged 11 commits intoturkdevops:masterfrom Jul 24, 2025
Merged
Conversation
A recent commit, d9cb0e6 (fast-export, fast-import: add support for signed-commits, 2025-03-10), added support for signed commits to fast-export and fast-import. When a signed commit is processed, fast-export can output either "gpgsig sha1" or "gpgsig sha256" depending on whether the signed commit uses the SHA-1 or SHA-256 Git object format. However, this implementation has a number of limitations: - the output format was not properly described in the documentation, - the output format is not very informative as it doesn't even say if the signature is an OpenPGP, an SSH, or an X509 signature, - the implementation doesn't support having both one signature on the SHA-1 object and one on the SHA-256 object. Let's improve on these limitations by improving fast-export and fast-import so that: - all the signatures are exported, - at most one signature on the SHA-1 object and one on the SHA-256 are imported, - if there is more than one signature on the SHA-1 object or on the SHA-256 object, fast-import emits a warning for each additional signature, - the output format is "gpgsig <git-hash-algo> <signature-format>", where <git-hash-algo> is the Git object format as before, and <signature-format> is the signature type ("openpgp", "x509", "ssh" or "unknown"), - the output is properly documented. About the output format: - <git-hash-algo> allows to know which representation of the commit was signed (the SHA-1 or the SHA-256 version) which helps with both signature verification and interoperability between repos with different hash functions, - <signature-format> helps tools that process the fast-export stream, so they don't have to parse the ASCII armor to identify the signature type. It could be even better to be able to import more than one signature on the SHA-1 object and on the SHA-256 object, but other parts of Git don't handle that well for now, so this is left for future improvements. Helped-by: brian m. carlson <sandals@crustytoothpaste.net> Helped-by: Elijah Newren <newren@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Compiling Git fails on Amazon Linux 2 when using GCC 7.3.1 with the
following compiler error:
In file included from compat/posix.h:449:0,
from git-compat-util.h:26,
from daemon.c:3:
compat/../sane-ctype.h:29:60: error: expected expression before ']' token
#define sane_istest(x,mask) ((sane_ctype[(unsigned char)(x)] & (mask)) != 0)
^
compat/../sane-ctype.h:29:72: error: expected ')' before '!=' token
#define sane_istest(x,mask) ((sane_ctype[(unsigned char)(x)] & (mask)) != 0)
^
compat/../sane-ctype.h:29:60: error: expected expression before ']' token
#define sane_istest(x,mask) ((sane_ctype[(unsigned char)(x)] & (mask)) != 0)
^
... lots of similar lines ...
compat/../sane-ctype.h:45:50: error: expected declaration specifiers or '...' before numeric constant
#define toupper(x) sane_case((unsigned char)(x), 0)
^
/usr/include/ctype.h:142:12: error: expected identifier or '(' before 'int'
extern int isascii (int __c) __THROW;
^
compat/../sane-ctype.h:30:26: error: expected ')' before '&' token
#define isascii(x) (((x) & ~0x7f) == 0)
^
compat/../sane-ctype.h:30:35: error: expected ')' before '==' token
#define isascii(x) (((x) & ~0x7f) == 0)
^
In file included from /usr/include/features.h:423:0,
from /usr/include/unistd.h:25,
from compat/posix.h:90,
from git-compat-util.h:26,
from daemon.c:3:
compat/../sane-ctype.h:44:30: error: expected declaration specifiers or '...' before '(' token
#define tolower(x) sane_case((unsigned char)(x), 0x20)
^
compat/../sane-ctype.h:44:50: error: expected declaration specifiers or '...' before numeric constant
#define tolower(x) sane_case((unsigned char)(x), 0x20)
^
compat/../sane-ctype.h:45:30: error: expected declaration specifiers or '...' before '(' token
#define toupper(x) sane_case((unsigned char)(x), 0)
^
compat/../sane-ctype.h:45:50: error: expected declaration specifiers or '...' before numeric constant
#define toupper(x) sane_case((unsigned char)(x), 0)
^
This error bisect back to 75a044f (git-compat-util.h: split out
POSIX-emulating bits, 2025-02-18), where lots of bits got split out of
"git-compat-util.h" into a new "compat/posix.h" header.
The compiler error isn't immediately obvious, doubly so because the
actual errors are ~3x as long as the above snippet. But what happens
here is that we transitively include <ctype.h> after we have included
our own "sane-ctype.h" header. Consequently, the function declarations
that exist in <ctype.h> for isascii(3p) et al will be mangled by our
macros of the same type. The result is of course completely broken.
It's unclear why this issue only happens on Amazon Linux 2. My guess is
that it's either specific to the compiler version or specific to the
glibc version. We don't explicitly include <ctypes.h> anywhere, but it's
being transitively included. So chances are that later versions of the
toolchain reorganized their headers so that <ctypes.h> is not included
transitively anymore.
Fix the issue by explicitly including <ctype.h> in "sane-ctype.h". This
ensures that the header guards will be activated and that any subsequent
include of the same header will become a no-op. With this we can then
safely override the function declarations with our own macros.
Reported-by: Stan Hu <stanhu@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In bloom.h, murmur3_seeded_v2() is exported for the use of test murmur3 hash. To clarify that murmur3_seeded_v2() is exported solely for testing purposes, a new helper function test_murmur3_seeded() was added instead of exporting murmur3_seeded_v2() directly. Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git code style requires that functions operating on a struct S should be named in the form S_verb. However, the functions operating on struct bloom_key do not follow this convention. Therefore, fill_bloom_key() and clear_bloom_key() are renamed to bloom_key_fill() and bloom_key_clear(), respectively. Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, we stored bloom keys in a flat array and marked a commit as NOT TREESAME if any key reported "definitely not changed". To support multiple pathspec items, we now require that for each pathspec item, there exists a bloom key reporting "definitely not changed". This "for every" condition makes a flat array insufficient, so we introduce a new structure to group keys by a single pathspec item. `struct bloom_keyvec` is introduced to replace `struct bloom_key *` and `bloom_key_nr`. And because we want to support multiple pathspec items, we added a bloom_keyvec * and a bloom_keyvec_nr field to `struct rev_info` to represent an array of bloom_keyvecs. This commit still optimize only one pathspec item, thus bloom_keyvec_nr can only be 0 or 1. New bloom_keyvec_* functions are added to create and destroy a keyvec. bloom_filter_contains_vec() is added to check if all key in keyvec is contained in a bloom filter. Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When preparing to use bloom filters in a revision walk, Git populates a boom_keyvec with an array of bloom keys for the components of a path. Before we create the ability to map multiple pathspecs to multiple bloom_keyvecs, extract the conversion from a pathspec to a bloom_keyvec into its own helper method. This simplifies the state that persists in prepare_to_use_bloom_filter() as well as makes the future change much simpler. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
To enable optimize multiple pathspec items in revision traversal, return 0 if all pathspec item is literal in forbid_bloom_filters(). Add for loops to initialize and check each pathspec item's bloom_keyvec when optimization is possible. Add new test cases in t/t4216-log-bloom.sh to ensure - consistent results between the optimization for multiple pathspec items using bloom filter and the case without bloom filter optimization. - does not use bloom filter if any pathspec item is not literal. With these optimizations, we get some improvements for multi-pathspec runs of 'git log'. First, in the Git repository we see these modest results: Benchmark 1: old Time (mean ± σ): 73.1 ms ± 2.9 ms Range (min … max): 69.9 ms … 84.5 ms 42 runs Benchmark 2: new Time (mean ± σ): 55.1 ms ± 2.9 ms Range (min … max): 51.1 ms … 61.2 ms 52 runs Summary 'new' ran 1.33 ± 0.09 times faster than 'old' But in a larger repo, such as the LLVM project repo below, we get even better results: Benchmark 1: old Time (mean ± σ): 1.974 s ± 0.006 s Range (min … max): 1.960 s … 1.983 s 10 runs Benchmark 2: new Time (mean ± σ): 262.9 ms ± 2.4 ms Range (min … max): 257.7 ms … 266.2 ms 11 runs Summary 'new' ran 7.51 ± 0.07 times faster than 'old' Signed-off-by: Derrick Stolee <stolee@gmail.com> [ly: rename convert_pathspec_to_filter() to convert_pathspec_to_bloom_keyvec()] Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Lift the limitation to use changed-path filter in "git log" so that it can be used for a pathspec with multiple literal paths. * ly/changed-paths-traversal: bloom: optimize multiple pathspec items in revision revision: make helper for pathspec to bloom keyvec bloom: replace struct bloom_key * with struct bloom_keyvec bloom: rename function operates on bloom_key bloom: add test helper to return murmur3 hash
Our <sane-ctype.h> header file relied on that the system-supplied <ctype.h> header is not later included, which would override our macro definitions, but "amazon linux" broke this assumption. Fix this by preemptively including <ctype.h> near the beginning of <sane-ctype.h> ourselves. * ps/sane-ctype-workaround: sane-ctype: fix compiler error on Amazon Linux 2
Clean up the way how signature on commit objects are exported to and imported from fast-import stream. * cc/fast-import-export-signature-names: fast-(import|export): improve on commit signature output format
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.3)
Can you help keep this open source service alive? 💖 Please sponsor : )